
    The TREC2001 video track: information retrieval on digital video information

    The development of techniques to support content-based access to archives of digital video information has recently started to receive much attention from the research community. During 2001, the annual TREC activity, which has been benchmarking the performance of information retrieval techniques on a range of media for 10 years, included a "track" which allowed investigation into approaches to searching through a video library. This paper is not intended to provide a comprehensive picture of the different approaches taken by the TREC2001 video track participants; instead, we give an overview of the TREC video search task and a thumbnail sketch of the approaches taken by different groups. The reason for writing this paper is to highlight the message from the TREC video track: there is now a variety of approaches available for searching and browsing through digital video archives, these approaches do work, they are scalable to larger archives, and they can yield useful retrieval performance for users. This has important implications in making digital libraries of video information attainable.

    What’s news, what’s not? associating news videos with words

    Text retrieval from broadcast news video is unsatisfactory, because a transcript word frequently does not directly 'describe' the shot during which it was spoken. Extending the retrieved region to a window around the matching keyword provides better recall, but low precision. We improve on text retrieval using the following approach: first, we segment the visual stream into coherent story-like units, using a set of visual news story delimiters. After filtering out clearly irrelevant classes of shots, we are still left with an ambiguity of how words in the transcript relate to the visual content in the remaining shots of the story. Using a limited set of visual features at different semantic levels, ranging from color histograms to faces, cars, and outdoor scenes, an association matrix captures the correlation of these visual features with specific transcript words. This matrix is then refined using an EM approach. Preliminary results show that this approach has the potential to significantly improve retrieval performance from text queries.
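The abstract does not give the exact form of the EM refinement. As a minimal sketch, one plausible reading is an IBM-Model-1-style alignment: each visual feature detected in a story is softly attributed to the transcript words of that story, and P(feature | word) is re-estimated until words concentrate on the features they co-occur with. The `stories` input format here is a hypothetical simplification, not the paper's actual data structure.

```python
from collections import defaultdict

def train_association(stories, iterations=10):
    """EM estimation of P(feature | word) from co-occurrence of transcript
    words and visual feature labels within the same news story.
    `stories` is a list of (words, features) pairs (hypothetical format)."""
    # Uniform initialisation over all observed visual features.
    features = {f for _, fs in stories for f in fs}
    prob = defaultdict(lambda: 1.0 / len(features))  # prob[(feature, word)]
    for _ in range(iterations):
        count = defaultdict(float)   # expected co-occurrence counts
        total = defaultdict(float)   # normaliser per word
        for words, feats in stories:
            for f in feats:
                # E-step: distribute one unit of credit for feature f
                # across the story's words, proportional to P(f | w).
                z = sum(prob[(f, w)] for w in words)
                for w in words:
                    c = prob[(f, w)] / z
                    count[(f, w)] += c
                    total[w] += c
        # M-step: renormalise so P(. | w) sums to 1 for each word.
        for (f, w), c in count.items():
            prob[(f, w)] = c / total[w]
    return dict(prob)
```

After a few iterations, a word like "car" that repeatedly co-occurs with a car-detector firing acquires a higher association with that feature than incidental words from the same stories do.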

    A Comparison of Speech vs Typed Input

    We conducted a series of empirical experiments in which users were asked to enter digit strings into the computer by voice or keyboard. Two different ways of verifying and correcting the spoken input were examined. Extensive timing analyses were performed to determine which aspects of the interface were critical to speedy completion of the task. The results show that speech is preferable for strings that require more than a few keystrokes. The results emphasize the need for fast and accurate speech recognition, but also demonstrate how error correction and input validation are crucial for an effective speech interface.

    Automatic title generation for spoken broadcast news

    We implemented several statistical title generation methods using a training set of 21,190 news stories and evaluated them on an independent test corpus of 1,006 broadcast news documents, comparing titles generated from manual transcriptions to titles generated from automatically recognized speech. We used both F1 and the average number of correct title words in the correct order as evaluation metrics. The results show that title generation for speech-recognized news documents is possible at a level approaching the accuracy of titles generated for perfect text transcriptions.
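The two metrics named above have natural word-level readings, though the abstract does not spell out the exact definitions; as a hedged sketch, F1 can be taken as word-overlap F1 between generated and reference titles, and "correct title words in the correct order" as the length of the longest common subsequence of the two word lists:

```python
def title_f1(generated, reference):
    """Word-overlap F1 between a generated and a reference title
    (one plausible reading of the paper's F1 metric)."""
    gen, ref = generated.lower().split(), reference.lower().split()
    overlap = len(set(gen) & set(ref))
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def words_in_order(generated, reference):
    """Correct title words in the correct order: length of the longest
    common subsequence of the two word lists."""
    g, r = generated.lower().split(), reference.lower().split()
    dp = [[0] * (len(r) + 1) for _ in range(len(g) + 1)]
    for i, gw in enumerate(g):
        for j, rw in enumerate(r):
            dp[i + 1][j + 1] = dp[i][j] + 1 if gw == rw else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(g)][len(r)]
```

For example, `title_f1("clinton visits ireland", "president clinton visits ireland today")` gives 0.75 (precision 1.0, recall 0.6), and the same pair yields 3 words in the correct order.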

    Improving Acoustic Models By Watching Television

    Obtaining sufficient labelled training data is a persistent difficulty for speech recognition research. Although well-transcribed data is expensive to produce, there is a constant stream of challenging speech data, with poor transcriptions, broadcast as closed-captioned television. We describe a reliable unsupervised method for identifying accurately transcribed sections of these broadcasts and show how these segments can be used to train a recognition system. Starting from acoustic models trained on the Wall Street Journal database, a single iteration of our training method reduced the word error rate on an independent broadcast television news test set from 62.2% to 59.5%. This paper is based on work supported by the National Science Foundation, DARPA and NASA under NSF Cooperative Agreement No. IRI-9411299. We thank Justsystem Corporation for supporting the preparation of the paper.
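The paper's selection criterion is not detailed in the abstract. A minimal stand-in for the idea, assuming the caption stream and a first-pass recognizer output have already been word-aligned (in practice this would need dynamic-programming alignment), is to keep only stretches where the two agree exactly for several consecutive words:

```python
def reliable_segments(caption_words, recognized_words, min_run=4):
    """Keep stretches where closed captions and recogniser output agree
    for at least `min_run` consecutive words -- a simple stand-in for
    unsupervised filtering of accurately captioned broadcast segments.
    Assumes the two word streams are already aligned position by position."""
    segments, run = [], []
    for cap, rec in zip(caption_words, recognized_words):
        if cap == rec:
            run.append(cap)
        else:
            if len(run) >= min_run:
                segments.append(run)
            run = []
    if len(run) >= min_run:
        segments.append(run)
    return segments
```

Segments surviving the filter can then serve as trusted transcriptions for retraining the acoustic models, while disagreed regions are discarded rather than risked as noisy labels.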